AITopics | Kano

Quantity vs. Quality of Monolingual Source Data in Automatic Text Translation: Can It Be Too Little If It Is Too Good?

Abdulmumin, Idris, Galadanci, Bashir Shehu, Aliyu, Garba, Muhammad, Shamsuddeen Hassan

arXiv.org Artificial Intelligence, Oct-17-2024

Monolingual data, being readily available in large quantities, has been used to upscale the scarcely available parallel data to train better models for automatic translation. Self-learning, where a model is made to learn from its own output, is one approach to exploiting such data. However, it has been shown that too much of this data can be detrimental to the performance of the model if the available parallel data is comparatively very scarce. In this study, we investigate whether the monolingual data can also be too little and whether this reduction, based on quality, has any effect on the performance of the translation model. Experiments have shown that on English-German low-resource NMT, it is often better to select only the most useful additional data, based on quality or closeness to the domain of the test data, than to utilize all of the available data.

machine learning, natural language, translation, (17 more...)

doi: 10.1109/NIGERCON54645.2022.9803137

2410.13783

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Switzerland (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(12 more...)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Active Sensing of Knee Osteoarthritis Progression with Reinforcement Learning

Nguyen, Khanh, Nguyen, Huy Hoang, Panfilov, Egor, Tiulpin, Aleksei

arXiv.org Artificial Intelligence, Aug-22-2024

Osteoarthritis (OA) is the most common musculoskeletal disease, and it has no cure. Knee OA (KOA) is one of the leading causes of disability worldwide, and it costs the global community billions of United States dollars. Prediction of KOA progression has been of high interest to the community for years, as it can advance treatment development through more efficient clinical trials and improve patient outcomes through more efficient healthcare utilization. Existing approaches for predicting KOA, however, are predominantly static, i.e. they consider data from a single time point to predict progression many years into the future, and knee-level, i.e. they consider progression in a single joint only. For these and related reasons, these methods fail to deliver a level of predictive performance sufficient to result in cost savings and better patient outcomes. Collecting extensive data from all patients on a regular basis could address the issue, but it is limited by the high cost at a population level. In this work, we propose to go beyond static prediction models in OA and introduce a novel Active Sensing (AS) approach, designed to dynamically follow up patients with the objective of maximizing the number of informative data acquisitions while minimizing their total cost over a period of time. Our approach is based on Reinforcement Learning (RL), and it leverages a novel reward function designed specifically for AS of disease progression in more than one part of the human body. Our method is end-to-end, relies on multi-modal Deep Learning, and requires no human input at inference time. Through an exhaustive experimental evaluation, we show that using RL can provide a higher monetary benefit when compared to state-of-the-art baselines.

osteoarthritis, prediction, progression, (17 more...)

2408.02349

Country:

Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > Canada > Ontario > Hamilton (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Rheumatology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Artificial Intelligence for Public Health Surveillance in Africa: Applications and Opportunities

Tshimula, Jean Marie, Kalengayi, Mitterrand, Makenga, Dieumerci, Lilonge, Dorcas, Asumani, Marius, Madiya, Déborah, Kalonji, Élie Nkuba, Kanda, Hugues, Galekwa, René Manassé, Kumbu, Josias, Mikese, Hardy, Tshimula, Grace, Muabila, Jean Tshibangu, Mayemba, Christian N., Nkashama, D'Jeff K., Kalala, Kalonji, Ataky, Steve, Basele, Tighana Wenge, Didier, Mbuyi Mukendi, Kasereka, Selain K., Dialufuma, Maximilien V., Kumwita, Godwill Ilunga Wa, Muyuku, Lionel, Kimpesa, Jean-Paul, Muteba, Dominique, Abedi, Aaron Aruna, Ntobo, Lambert Mukendi, Bundutidi, Gloria M., Mashinda, Désiré Kulimba, Mpinga, Emmanuel Kabengele, Kasoro, Nathanaël M.

arXiv.org Artificial Intelligence, Aug-5-2024

Artificial Intelligence (AI) is revolutionizing various fields, including public health surveillance. In Africa, where health systems frequently encounter challenges such as limited resources, inadequate infrastructure, failed health information systems and a shortage of skilled health professionals, AI offers a transformative opportunity. This paper investigates the applications of AI in public health surveillance across the continent, presenting successful case studies and examining the benefits, opportunities, and challenges of implementing AI technologies in African healthcare settings. Our paper highlights AI's potential to enhance disease monitoring and health outcomes and to support effective public health interventions. The findings presented in the paper demonstrate that AI can significantly improve the accuracy and timeliness of disease detection and prediction, optimize resource allocation, and facilitate targeted public health strategies. Additionally, our paper identifies key barriers to the widespread adoption of AI in African public health systems and proposes actionable recommendations to overcome these challenges.

africa, outbreak, prediction, (15 more...)

2408.02575

Country:

Africa > Sub-Saharan Africa (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Africa > West Africa (0.05)
(78 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Vaccines (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(6 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(7 more...)

DustNet: skillful neural network predictions of Saharan dust

Nowak, Trish E., Augousti, Andy T., Simmons, Benno I., Siegert, Stefan

arXiv.org Artificial Intelligence, Jun-17-2024

Suspended in the atmosphere are millions of tonnes of mineral dust, which interacts with weather and climate. Accurate representation of mineral dust in weather models is vital, yet remains challenging. Large-scale weather models use high-power supercomputers and take hours to complete a forecast. This computational burden allows them to include only monthly climatological means of mineral dust as input states, which inhibits their forecasting accuracy. Here, we introduce DustNet, a simple, accurate and super-fast forecasting model for 24-hour-ahead predictions of aerosol optical depth (AOD). DustNet trains in less than 8 minutes and creates predictions in 2 seconds on a desktop computer. Predictions created by DustNet outperform the state-of-the-art physics-based model on a coarse 1 x 1 degree resolution at 95% of grid locations when compared to ground-truth satellite data. Our results show that DustNet has potential for fast and accurate AOD forecasting, which could transform our understanding of dust impacts on weather patterns.

dustnet, forecast, prediction, (17 more...)

2406.11754

Country:

Africa > West Africa (0.14)
Atlantic Ocean > South Atlantic Ocean > Gulf of Guinea (0.05)
Africa > Gulf of Guinea (0.05)
(17 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Government > Regional Government (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Analyzing COVID-19 Vaccination Sentiments in Nigerian Cyberspace: Insights from a Manually Annotated Twitter Dataset

Ahmad, Ibrahim Said, Aliyu, Lukman Jibril, Khalid, Abubakar Auwal, Aliyu, Saminu Muhammad, Muhammad, Shamsuddeen Hassan, Abdulmumin, Idris, Abduljalil, Bala Mairiga, Bello, Bello Shehu, Abubakar, Amina Imam

arXiv.org Artificial Intelligence, Jan-23-2024

Numerous successes have been achieved in combating the COVID-19 pandemic, initially using various precautionary measures like lockdowns, social distancing, and the use of face masks. More recently, various vaccines have been developed to aid in the prevention or reduction of the severity of COVID-19 infection. Despite the effectiveness of the precautionary measures and the vaccines, several controversies are widely shared on social media platforms like Twitter. In this paper, we explore the use of state-of-the-art transformer-based language models to study people's acceptance of vaccines in Nigeria. We developed a novel dataset by crawling multilingual tweets using relevant hashtags and keywords. Our analysis and visualizations revealed that most tweets expressed neutral sentiments about COVID-19 vaccines, with some individuals expressing positive views, and that there was no strong preference for specific vaccine types, although Moderna received slightly more positive sentiment. We also found that fine-tuning a pre-trained LLM with an appropriate dataset can yield competitive results, even if the LLM was not initially pre-trained on the specific language of that dataset.

sentiment, tweet, vaccine, (12 more...)

2401.13133

Country:

North America > United States (0.14)
Africa > Nigeria > Kano State > Kano (0.05)
Africa > Nigeria > Federal Capital Territory > Abuja (0.05)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Leveraging Closed-Access Multilingual Embedding for Automatic Sentence Alignment in Low Resource Languages

Abdulmumin, Idris, Khalid, Auwal Abubakar, Muhammad, Shamsuddeen Hassan, Ahmad, Ibrahim Said, Aliyu, Lukman Jibril, Sani, Babangida, Abduljalil, Bala Mairiga, Hassan, Sani Ahmad

arXiv.org Artificial Intelligence, Nov-20-2023

The importance of high-quality parallel data in machine translation has long been established, but it has always been very difficult to obtain such data in sufficient quantity for the majority of the world's languages, mainly because of the associated cost and the lack of accessibility to these languages. Despite the potential for obtaining parallel datasets from online articles using automatic approaches, forensic investigations have found many quality-related issues such as misalignment and wrong language codes. In this work, we present a simple but high-quality parallel sentence aligner that carefully leverages the closed-access Cohere multilingual embedding, a solution that ranked second in the just concluded #CoHereAIHack 2023 Challenge (see https://ai6lagos.devpost.com). The proposed approach achieved F1 scores of 94.96 and 54.83 on FLORES and MAFAND-MT, compared to 3.64 and 0.64 for LASER, respectively. Our method also achieved an improvement of more than 5 BLEU points over LASER when the resulting datasets were used with the MAFAND-MT dataset to train translation models. Our code and data are available for research purposes here (https://github.com/abumafrim/Cohere-Align).

aclanthology, computational linguistic, translation, (13 more...)

2311.12179

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Africa > Nigeria > Kano State > Kano (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language

Parida, Shantipriya, Abdulmumin, Idris, Muhammad, Shamsuddeen Hassan, Bose, Aneesh, Kohli, Guneet Singh, Ahmad, Ibrahim Said, Kotwal, Ketan, Sarkar, Sayan Deb, Bojar, Ondřej, Kakudi, Habeebah Adamu

arXiv.org Artificial Intelligence, May-28-2023

This paper presents HaVQA, the first multimodal dataset for visual question-answering (VQA) tasks in the Hausa language. The dataset was created by manually translating 6,022 English question-answer pairs, which are associated with 1,555 unique images from the Visual Genome dataset. As a result, the dataset provides 12,044 gold standard English-Hausa parallel sentences that were translated in a fashion that guarantees their semantic match with the corresponding visual information. We conducted several baseline experiments on the dataset, including visual question answering, visual question elicitation, text-only and multimodal machine translation.

artificial intelligence, natural language, question answering, (17 more...)

2305.1769

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Africa > Nigeria > Jigawa State > Dutse (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(32 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.93)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages

Dione, Cheikh M. Bamba, Adelani, David, Nabende, Peter, Alabi, Jesujoba, Sindane, Thapelo, Buzaaba, Happy, Muhammad, Shamsuddeen Hassan, Emezue, Chris Chinenye, Ogayo, Perez, Aremu, Anuoluwapo, Gitau, Catherine, Mbaye, Derguene, Mukiibi, Jonathan, Sibanda, Blessing, Dossou, Bonaventure F. P., Bukula, Andiswa, Mabuya, Rooweither, Tapo, Allahsera Auguste, Munkoh-Buabeng, Edwin, Koagne, victoire Memdjokam, Kabore, Fatoumata Ouoba, Taylor, Amelia, Kalipe, Godson, Macucwa, Tebogo, Marivate, Vukosi, Gwadabe, Tajuddeen, Elvis, Mboning Tchiaze, Onyenwe, Ikechukwu, Atindogbe, Gratien, Adelani, Tolulope, Akinade, Idris, Samuel, Olanrewaju, Nahimana, Marien, Musabeyezu, Théogène, Niyomutabazi, Emile, Chimhenga, Ester, Gotosa, Kudzai, Mizha, Patrick, Agbolo, Apelete, Traore, Seydou, Uchechukwu, Chinedu, Yusuf, Aliyu, Abdullahi, Muhammad, Klakow, Dietrich

arXiv.org Artificial Intelligence, May-23-2023

In this paper, we present MasakhaPOS, the largest part-of-speech (POS) dataset for 20 typologically diverse African languages. We discuss the challenges in annotating POS for these languages using the UD (universal dependencies) guidelines. We conducted extensive POS baseline experiments using conditional random field and several multilingual pre-trained language models. We applied various cross-lingual transfer models trained with data available in UD. Evaluating on the MasakhaPOS dataset, we show that choosing the best transfer language(s) in both single-source and multi-source setups greatly improves the POS tagging performance of the target languages, in particular when combined with cross-lingual parameter-efficient fine-tuning methods. Crucially, transferring knowledge from a language that matches the language family and morphosyntactic properties seems more effective for POS tagging in unseen languages.

artificial intelligence, machine learning, natural language, (18 more...)

2305.13989

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Greater London > London (0.14)
Africa > Niger (0.05)
(31 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

HERDPhobia: A Dataset for Hate Speech against Fulani in Nigeria

Aliyu, Saminu Mohammad, Wajiga, Gregory Maksha, Murtala, Muhammad, Muhammad, Shamsuddeen Hassan, Abdulmumin, Idris, Ahmad, Ibrahim Said

arXiv.org Artificial Intelligence, Nov-28-2022

Social media platforms allow users to freely share their opinions about issues or anything they feel like. However, they also make it easier to spread hate and abusive content. The Fulani ethnic group has been the victim of this unfortunate phenomenon. This paper introduces HERDPhobia - the first annotated hate speech dataset on Fulani herders in Nigeria - in three languages: English, Nigerian-Pidgin, and Hausa. We present a benchmark experiment using pre-trained language models to classify the tweets as either hateful or non-hateful. Our experiment shows that the XLM-T model provides the best performance, with 99.83% weighted F1. We released the dataset at https://github.com/hausanlp/HERDPhobia for further research.

machine learning, natural language, tweet, (19 more...)

2211.15262

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
North America > Dominican Republic (0.05)
(4 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Deep Sequence Models for Text Classification Tasks

Abdullahi, Saheed Salahudeen, Yiming, Sun, Muhammad, Shamsuddeen Hassan, Mustapha, Abdulrasheed, Aminu, Ahmad Muhammad, Abdullahi, Abdulkadir, Bello, Musa, Aliyu, Saminu Mohammad

arXiv.org Artificial Intelligence, Jul-18-2022

The exponential growth of data generated on the Internet in the current information age is a driving force for the digital economy. Extracting information is the major value of accumulated big data. The statistical analysis and hand-engineered, rule-based machine learning algorithms on which big data processing depends are overwhelmed by the vast complexities inherent in human languages. Natural Language Processing (NLP) is equipping machines to understand these diverse and complicated human languages. Text classification is an NLP task that automatically identifies patterns based on predefined or undefined labeled sets. Common text classification applications include information retrieval, news topic modeling, theme extraction, sentiment analysis, and spam detection. In texts, some sequences of words depend on the previous or next word sequences to make full meaning; this is a challenging dependency task that requires the machine to store important previous information so that it can inform future meaning. Sequence models such as RNN, GRU, and LSTM are a breakthrough for tasks with long-range dependencies. As such, we applied these models to binary and multi-class classification. The results were excellent, with most of the models performing in the range of 80% to 94%. However, this result is not exhaustive, as we believe there is room for improvement if machines are to compete with humans.

classification, classification task, sequence model, (14 more...)

doi: 10.1109/ICECCE52056.2021.9514261

2207.0888

Country:

North America > United States (0.14)
Africa > Nigeria > Kaduna State > Kaduna (0.05)
Europe > Portugal > Porto > Porto (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
